Case Study #2 - Customer Orders Dataset

Created by: Srikrishnan Veeraraghavan

I suspect that the text values in customer_email have extra leading spaces, which might cause issues in calculations that match customers by email.
Some values in the email column did contain extra whitespace, which has now been removed. This will ensure more accurate calculations.
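As a minimal sketch of why this cleanup matters, the snippet below strips whitespace from a couple of made-up rows and compares the number of distinct emails before and after. The field names `net_revenue` and `year` and the sample values are assumptions for illustration, not taken from the dataset.

```python
# Toy rows for illustration only; field names other than customer_email are assumed.
rows = [
    {"customer_email": " a@example.com", "net_revenue": 100.0, "year": 2015},
    {"customer_email": "a@example.com", "net_revenue": 120.0, "year": 2016},
]

def strip_emails(rows):
    """Return copies of the rows with surrounding whitespace removed from the email."""
    return [{**row, "customer_email": row["customer_email"].strip()} for row in rows]

cleaned = strip_emails(rows)

# Before cleaning, the same customer appears as two different keys.
distinct_before = {row["customer_email"] for row in rows}
distinct_after = {row["customer_email"] for row in cleaned}
print(len(distinct_before), len(distinct_after))
```

Without the strip, the same customer would be counted once as "new" and once as "lost" in the year-over-year comparisons that follow.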



Moving on to the problem statements:

1) Total revenue for the current year
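One possible way to compute this, sketched on toy data: filter the rows to the year of interest and sum the revenue. The row layout (`customer_email`, `net_revenue`, `year`) and the values are assumed for illustration.

```python
# Toy rows for illustration only; the field names are assumptions.
rows = [
    {"customer_email": "a@example.com", "net_revenue": 100.0, "year": 2015},
    {"customer_email": "b@example.com", "net_revenue": 50.0, "year": 2015},
    {"customer_email": "a@example.com", "net_revenue": 120.0, "year": 2016},
    {"customer_email": "c@example.com", "net_revenue": 70.0, "year": 2016},
]

def total_revenue(rows, year):
    """Sum net revenue over all orders placed in the given year."""
    return sum(row["net_revenue"] for row in rows if row["year"] == year)

revenue_2016 = total_revenue(rows, 2016)
print(revenue_2016)
```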

2) New Customer Revenue e.g., new customers not present in previous year only
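A hedged sketch of this calculation: treat a customer as "new" in a year if their email does not appear in the year before, then sum the current-year revenue of those customers. The helper names and the toy rows are hypothetical.

```python
# Toy rows for illustration only; the field names are assumptions.
rows = [
    {"customer_email": "a@example.com", "net_revenue": 100.0, "year": 2015},
    {"customer_email": "b@example.com", "net_revenue": 50.0, "year": 2015},
    {"customer_email": "a@example.com", "net_revenue": 120.0, "year": 2016},
    {"customer_email": "c@example.com", "net_revenue": 70.0, "year": 2016},
]

def customers_in(rows, year):
    """Set of distinct customer emails with at least one order in the year."""
    return {row["customer_email"] for row in rows if row["year"] == year}

def new_customer_revenue(rows, current_year):
    """Current-year revenue from customers absent in the previous year."""
    previous = customers_in(rows, current_year - 1)
    return sum(
        row["net_revenue"]
        for row in rows
        if row["year"] == current_year and row["customer_email"] not in previous
    )

new_revenue_2016 = new_customer_revenue(rows, 2016)
print(new_revenue_2016)
```

Note that for the first year in the data every customer counts as new under this definition, since there is no earlier year to compare against.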

3) Existing Customer Growth. To calculate this, use the Revenue of existing customers for current year –(minus) Revenue of existing customers from the previous year

and

5) Existing Customer Revenue Current Year

and

6) Existing Customer Revenue Previous Year

I'm not completely clear on question 3. I have assumed that it means: (revenue of existing, i.e. non-new, customers in the current year) - (revenue of those same customers in the previous year).
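Under that interpretation, questions 3, 5 and 6 can be answered together. The sketch below takes "existing" customers to be those present in both years; the function names and toy rows are illustrative assumptions.

```python
# Toy rows for illustration only; the field names are assumptions.
rows = [
    {"customer_email": "a@example.com", "net_revenue": 100.0, "year": 2015},
    {"customer_email": "b@example.com", "net_revenue": 50.0, "year": 2015},
    {"customer_email": "a@example.com", "net_revenue": 120.0, "year": 2016},
    {"customer_email": "c@example.com", "net_revenue": 70.0, "year": 2016},
]

def customers_in(rows, year):
    """Set of distinct customer emails with at least one order in the year."""
    return {row["customer_email"] for row in rows if row["year"] == year}

def revenue_of(rows, year, customers):
    """Revenue in the given year restricted to the given set of customers."""
    return sum(
        row["net_revenue"]
        for row in rows
        if row["year"] == year and row["customer_email"] in customers
    )

def existing_customer_growth(rows, current_year):
    """Return (current revenue, previous revenue, growth) for existing customers."""
    previous_year = current_year - 1
    # Existing customers are those who ordered in both years.
    existing = customers_in(rows, current_year) & customers_in(rows, previous_year)
    current = revenue_of(rows, current_year, existing)
    previous = revenue_of(rows, previous_year, existing)
    return current, previous, current - previous

current, previous, growth = existing_customer_growth(rows, 2016)
print(current, previous, growth)
```

The first two returned values correspond to questions 5 and 6, and their difference to question 3.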

4) Revenue lost from attrition

and

10) Lost Customers
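The questions do not define attrition precisely, so the sketch below assumes one reading: a lost customer is one who ordered in the previous year but not in the current year, and the revenue lost is what those customers spent in the previous year. Names and data are hypothetical.

```python
# Toy rows for illustration only; the field names are assumptions.
rows = [
    {"customer_email": "a@example.com", "net_revenue": 100.0, "year": 2015},
    {"customer_email": "b@example.com", "net_revenue": 50.0, "year": 2015},
    {"customer_email": "a@example.com", "net_revenue": 120.0, "year": 2016},
    {"customer_email": "c@example.com", "net_revenue": 70.0, "year": 2016},
]

def customers_in(rows, year):
    """Set of distinct customer emails with at least one order in the year."""
    return {row["customer_email"] for row in rows if row["year"] == year}

def lost_customers(rows, current_year):
    """Customers present in the previous year but absent in the current year."""
    return customers_in(rows, current_year - 1) - customers_in(rows, current_year)

def revenue_lost_from_attrition(rows, current_year):
    """Previous-year revenue of the customers lost in the current year (assumed definition)."""
    lost = lost_customers(rows, current_year)
    return sum(
        row["net_revenue"]
        for row in rows
        if row["year"] == current_year - 1 and row["customer_email"] in lost
    )

lost_2016 = lost_customers(rows, 2016)
revenue_lost_2016 = revenue_lost_from_attrition(rows, 2016)
print(sorted(lost_2016), revenue_lost_2016)
```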

5) Existing Customer Revenue Current Year

Refer to question 3

6) Existing Customer Revenue Previous Year

Refer to question 3

7) Total Customers Current Year

and

8) Total Customers Previous Year
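Counting customers per year amounts to counting distinct emails, which is why the whitespace cleanup matters here. A small sketch on assumed toy rows:

```python
# Toy rows for illustration only; the field names are assumptions.
rows = [
    {"customer_email": "a@example.com", "net_revenue": 100.0, "year": 2015},
    {"customer_email": "b@example.com", "net_revenue": 50.0, "year": 2015},
    {"customer_email": "a@example.com", "net_revenue": 120.0, "year": 2016},
    {"customer_email": "a@example.com", "net_revenue": 30.0, "year": 2016},
    {"customer_email": "c@example.com", "net_revenue": 70.0, "year": 2016},
]

def total_customers(rows, year):
    """Number of distinct customer emails with at least one order in the year."""
    return len({row["customer_email"] for row in rows if row["year"] == year})

customers_2016 = total_customers(rows, 2016)
customers_2015 = total_customers(rows, 2015)
print(customers_2016, customers_2015)
```

A customer with several orders in one year (like the first email in 2016 above) is counted once.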

9) New Customers
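New customers can be expressed as a set difference between the two years' customer sets. This is a sketch with hypothetical helper names and made-up rows:

```python
# Toy rows for illustration only; the field names are assumptions.
rows = [
    {"customer_email": "a@example.com", "net_revenue": 100.0, "year": 2015},
    {"customer_email": "b@example.com", "net_revenue": 50.0, "year": 2015},
    {"customer_email": "a@example.com", "net_revenue": 120.0, "year": 2016},
    {"customer_email": "c@example.com", "net_revenue": 70.0, "year": 2016},
]

def customers_in(rows, year):
    """Set of distinct customer emails with at least one order in the year."""
    return {row["customer_email"] for row in rows if row["year"] == year}

def new_customers(rows, current_year):
    """Customers present in the current year but absent in the previous year."""
    return customers_in(rows, current_year) - customers_in(rows, current_year - 1)

new_2016 = new_customers(rows, 2016)
print(sorted(new_2016))
```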

10) Lost Customers

Refer to question 4



Insights from Data

Attrition Rate of Customers

The customer attrition rate is quite high! It decreases in 2016, and it would be interesting to observe this trend over a longer period of time and to identify the reason for the high attrition.
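The exact definition behind the plot is not stated here; one common choice, assumed in this sketch, is the share of the previous year's customers who did not return in the current year. The rows and function names are illustrative only.

```python
# Toy rows for illustration only; the field names are assumptions.
rows = [
    {"customer_email": "a@example.com", "net_revenue": 100.0, "year": 2015},
    {"customer_email": "b@example.com", "net_revenue": 50.0, "year": 2015},
    {"customer_email": "a@example.com", "net_revenue": 120.0, "year": 2016},
    {"customer_email": "c@example.com", "net_revenue": 70.0, "year": 2016},
    {"customer_email": "a@example.com", "net_revenue": 90.0, "year": 2017},
    {"customer_email": "c@example.com", "net_revenue": 60.0, "year": 2017},
]

def customers_in(rows, year):
    """Set of distinct customer emails with at least one order in the year."""
    return {row["customer_email"] for row in rows if row["year"] == year}

def attrition_rate(rows, year):
    """Fraction of the previous year's customers who placed no order in `year`.

    Returns None when there is no previous-year customer base to compare against.
    """
    previous = customers_in(rows, year - 1)
    if not previous:
        return None
    lost = previous - customers_in(rows, year)
    return len(lost) / len(previous)

rates = {year: attrition_rate(rows, year) for year in (2015, 2016, 2017)}
print(rates)
```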

New vs Existing Customer Proportion and Revenue

The percentage of new customers and the percentage of revenue from new customers are very similar. This suggests that, in terms of revenue per customer, there is no real value to retaining existing customers as long as new customers are sourced to replace the ones attriting.
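The two percentages being compared can be sketched as follows: the share of current-year customers who are new, and the share of current-year revenue that comes from them. Field names, helper names and values are assumptions for illustration.

```python
# Toy rows for illustration only; the field names are assumptions.
rows = [
    {"customer_email": "a@example.com", "net_revenue": 100.0, "year": 2015},
    {"customer_email": "b@example.com", "net_revenue": 50.0, "year": 2015},
    {"customer_email": "a@example.com", "net_revenue": 120.0, "year": 2016},
    {"customer_email": "c@example.com", "net_revenue": 70.0, "year": 2016},
]

def customers_in(rows, year):
    """Set of distinct customer emails with at least one order in the year."""
    return {row["customer_email"] for row in rows if row["year"] == year}

def new_customer_shares(rows, year):
    """Return (share of customers that are new, share of revenue from new customers).

    Assumes the year has at least one order with positive total revenue.
    """
    current = customers_in(rows, year)
    new = current - customers_in(rows, year - 1)
    year_rows = [row for row in rows if row["year"] == year]
    total = sum(row["net_revenue"] for row in year_rows)
    from_new = sum(row["net_revenue"] for row in year_rows if row["customer_email"] in new)
    return len(new) / len(current), from_new / total

customer_share, revenue_share = new_customer_shares(rows, 2016)
print(round(customer_share, 3), round(revenue_share, 3))
```

If the two shares are close, new and existing customers spend about the same on average, which is the observation made above.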

Distribution of Net Revenue of Existing Customers

We can see that the distribution of revenue is almost uniform, which is surprising, as one would expect a quantity like this to be roughly normally distributed.
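A rough way to check this impression without a plot is to bucket the values into equal-width bins and compare the counts: under a uniform distribution the counts should be about equal. The sketch below uses made-up values and a hypothetical `histogram` helper, and assumes the values are not all identical.

```python
from collections import Counter

def histogram(values, bins):
    """Count values in `bins` equal-width buckets spanning min(values)..max(values).

    Assumes max(values) > min(values); the maximum falls into the last bucket.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = Counter()
    for value in values:
        index = min(int((value - lo) / width), bins - 1)
        counts[index] += 1
    return [counts[i] for i in range(bins)]

# Made-up revenue values, evenly spread, purely for illustration.
values = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
bin_counts = histogram(values, 4)
print(bin_counts)
```

Bell-shaped data would instead show noticeably larger counts in the middle buckets than at the edges.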